Bayesian regression based on principal components for high-dimensional data

Authors

  • Jae Yong Lee
  • Hee-Seok Oh
Abstract

Motivated by a climate prediction problem, we consider high-dimensional Bayesian regression where the number of covariates is much larger than the number of observations. To reduce the dimension of the covariates, the response is regressed on the principal components obtained from the covariates, and it is argued that this PCA regression is equivalent to the original model in terms of prediction. In the PCA regression setting under a sparsity condition, we examine large-sample properties of two modeling strategies: regression with and without covariate selection. For regression without covariate selection, we obtain consistency results for the estimators and posteriors under normal priors with constant and decreasing variances, and for the James-Stein estimator; for regression with covariate selection, we obtain convergence rates for the Bayesian model averaging (BMA) and median probability model (MPM) estimators, and for the posterior under a variable selection prior. Based on these large-sample properties, we conclude that variable selection is essential in high-dimensional Bayesian regression. A simulation study confirms this conclusion. The methodologies are applied to a climate prediction problem.

AMS 2000 subject classifications: Primary 62C10; secondary 62J05.
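To make the setting concrete, here is a minimal sketch (not the authors' implementation) of principal-component regression with p much larger than n: the response is regressed on the leading principal components of the covariates, and the PC coefficients are shrunk via the posterior mean under an independent normal prior. The function name, the choices of k, tau2, and sigma2, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def pc_bayes_regression(X, y, k, tau2=1.0, sigma2=1.0):
    """Regress y on the first k principal components of X and return a
    predictor. The PC coefficients are the posterior mean under an
    independent N(0, tau2) prior, i.e. a ridge-type shrinkage estimate."""
    X_mean = X.mean(axis=0)
    Xc = X - X_mean                        # center the covariates
    # Principal component directions from the SVD of the centered design
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                         # p x k loading matrix
    Z = Xc @ V_k                           # n x k principal component scores
    # Posterior mean: beta_hat = (Z'Z + (sigma2/tau2) I)^{-1} Z'y
    A = Z.T @ Z + (sigma2 / tau2) * np.eye(k)
    beta_hat = np.linalg.solve(A, Z.T @ y)
    intercept = y.mean()

    def predict(X_new):
        return intercept + (X_new - X_mean) @ V_k @ beta_hat
    return predict

# Toy example with p >> n: n = 30 observations, p = 500 covariates
rng = np.random.default_rng(0)
n, p, k = 30, 500, 5
X = rng.normal(size=(n, p))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=n)
predict = pc_bayes_regression(X, y, k)
print(predict(X[:5]))                      # in-sample fitted values for a quick check
```

Under this conjugate setup the posterior mean coincides with a ridge-type estimate on the PC scores; the selection-based strategies discussed in the abstract (BMA, MPM, and variable selection priors) would additionally decide which components enter the model.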

Similar articles

Multivariate Bayesian Kernel Regression Model for High Dimensional Data and its Practical Applications in Near Infrared (NIR) Spectroscopy

Non-linear regression based on reproducing kernel Hilbert spaces (RKHS) has recently become very popular for fitting high-dimensional data. The RKHS formulation provides an automatic dimension reduction of the covariates. This is particularly helpful when the number of covariates ($p$) far exceeds the number of data points. In this paper, we introduce a Bayesian nonlinear multivariate regression m...
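For orientation only, and not the multivariate Bayesian kernel model described in that paper: a minimal RKHS-type fit can be written as kernel ridge regression, where the estimate lies in the span of the n kernel functions rather than in the p-dimensional covariate space. The Gaussian kernel, bandwidth, and penalty below are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gram matrix of the Gaussian (RBF) kernel between rows of A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def kernel_ridge_fit(X, y, lam=1e-2, bandwidth=1.0):
    """Solve (K + lam*I) alpha = y; predictions are K(X_new, X) @ alpha.
    Dimension reduction is implicit: the fitted function lives in the span
    of the n kernel functions, regardless of how large p is."""
    K = gaussian_kernel(X, X, bandwidth)
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)
    return lambda X_new: gaussian_kernel(X_new, X, bandwidth) @ alpha

# Toy example with p much larger than n
rng = np.random.default_rng(1)
n, p = 40, 300
X = rng.normal(size=(n, p))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=n)
predict = kernel_ridge_fit(X, y, bandwidth=np.sqrt(p))
print(predict(X[:3]))
```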

Methods for regression analysis in high-dimensional data

As science, knowledge, and technology evolve, new and precise methods for measuring, collecting, and recording information have been developed, resulting in the appearance and growth of high-dimensional data. A high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...

Bayesian Factor Regression Models in the “Large p, Small n” Paradigm

I discuss Bayesian factor regression models and prediction with very many explanatory variables. Such problems arise in many areas; my motivating applications are in studies of gene expression in functional genomics. I first discuss empirical factor (principal components) regression, and the use of general classes of shrinkage priors, with an example. These models raise foundational questions f...

Estimation of Variance Components for Body Weight of Moghani Sheep Using B-Spline Random Regression Models

The aim of the present study was the estimation of (co)variance components and genetic parameters for body weight of Moghani sheep, using random regression models based on B-spline functions. The data set included 9165 body weight records from 60 to 360 days of age from 2811 Moghani sheep, collected between 1994 and 2013 at the Jafar-Abad Animal Research and Breeding Institute, Ardabil province,...

Persian Handwriting Analysis Using Functional Principal Components

Principal components analysis is a well-known statistical method for dealing with large dependent data sets. It is also used with functional data, both for data reduction and for representing variation. Handwriting, for its part, is an object studied in various statistical fields such as pattern recognition and shape analysis. Considering time as the argument,...

Journal:
  • J. Multivariate Analysis

Volume: 117, Issue: -

Pages: -

Published: 2013